Abstract

Beginning with the 2002 discovery of the "Amati Relation" of GRB spectra, there has been much interest in the possibility that this and other correlations of GRB phenomenology might be used to make GRBs into standard candles. One recurring apparent difficulty with this program has been that some of the primary observational quantities to be fit as "data" (to wit, the isotropic-equivalent prompt energy $E_{\mathrm{iso}}$ and the collimation-corrected "total" prompt energy $E_\gamma$) depend for their construction on the very cosmological models that they are supposed to help constrain. This is the so-called "circularity problem" of standard candle GRBs. This paper is intended to point out that the circularity problem is not in fact a problem at all, except to the extent that it amounts to a self-inflicted wound. It arises essentially because of an unfortunate choice of data variables: "source-frame" variables such as $E_{\mathrm{iso}}$, which are unnecessarily encumbered by cosmological considerations. If, instead, the empirical correlations of GRB phenomenology which are formulated in source variables are mapped to the primitive observational variables (such as fluence) and compared to the observations in that space, then all taint of circularity disappears. I also indicate here a set of procedures for encoding high-dimensional empirical correlations (such as between $E_{\mathrm{iso}}$, $E_{\mathrm{pk}}^{(\mathrm{src})}$, $t_{\mathrm{jet}}^{(\mathrm{src})}$, and $T_{0.45}^{(\mathrm{src})}$) in a "Gaussian Tube" smeared model that includes both the correlation and its intrinsic scatter, and how that source-variable model may easily be mapped to the space of primitive observables, to be convolved with the measurement errors and fashioned into a likelihood. I discuss the projections of such Gaussian tubes into sub-spaces, which may be used to incorporate data from GRB events that may lack some element of the data (for example, GRBs without ascertained jet-break times). In this way, a large set of inhomogeneously observed GRBs may be assimilated into a single analysis, so long as each possesses at least two correlated data attributes.

Keywords: Gamma Rays: Bursts, Cosmology: Cosmological Parameters, Methods: Data Analysis, Methods: Statistical
1. Introduction

Since the earliest published evidence of tight correlations in gamma-ray burst (GRB) spectral properties (Amati et al., 2002), there has been sustained interest in pressing those correlations into service to make GRBs into standard candles, which is the same office that the Phillips correlation performs for SN Ia (Phillips, 1993; Riess et al., 1998; Goldhaber and Perlmutter, 1998). The intriguing possibility is that GRBs may open a window in redshift space ($z \sim 1$ to $8$) beyond what is provided by SN Ia studies, for the purpose of constraining the parameters that characterize Dark Energy (Dai et al., 2004; Ghirlanda et al., 2004; Friedman and Bloom, 2005; Liang and Zhang, 2005; Firmani et al., 2005).

The earliest correlation, the "Amati Relation", discovered by Amati et al. (2002), was between the isotropic-equivalent prompt emission energy $E_{\mathrm{iso}}$ and the peak energy $E_{\mathrm{pk}}^{(\mathrm{src})}$ of the Band-function spectrum fit to the time-integrated prompt emission from the burst, boosted to the source frame by the expansion factor $1 + z$. Other correlations were discovered in short order, ostensibly exhibiting tighter scatter that could make them more suitable for standardizing candles. Examples are the "Ghirlanda Relation" (Ghirlanda et al., 2004), connecting the collimation-corrected prompt energy $E_\gamma$ and $E_{\mathrm{pk}}^{(\mathrm{src})}$; the "Liang-Zhang Relation" (Liang and Zhang, 2005), connecting $E_{\mathrm{iso}}$ with a fit-determined function constructed from $E_{\mathrm{pk}}^{(\mathrm{src})}$ and the source-frame jet-break time $t_{\mathrm{jet}}^{(\mathrm{src})}$; and the "Firmani Relation" (Firmani et al., 2006), analogous to the Liang-Zhang relation, but replacing the dependence on $t_{\mathrm{jet}}^{(\mathrm{src})}$ with one on $T_{0.45}^{(\mathrm{src})}$, the source-frame "emission time," which is a duration measure that robustly stands up to the diversity of duty cycles observed in prompt GRB emission (Reichart et al., 2001).

* Corresponding Author
Email address: carlo@oddjob.uchicago.edu (Carlo Graziani)

The later correlations of Liang and Zhang (2005) and Firmani et al. (2006) constitute considerable advances over the earlier work on constructing GRB distance indicators. By passing from ($E_\gamma$, $E_{\mathrm{pk}}^{(\mathrm{src})}$) to a 2-D projection of the space ($E_{\mathrm{iso}}$, $E_{\mathrm{pk}}^{(\mathrm{src})}$, $t_{\mathrm{jet}}^{(\mathrm{src})}$), Liang and Zhang (2005) eliminated all reference to highly uncertain theoretical factors (the density of the ISM in the burst source neighborhood, and the conversion efficiency of kinetic energy to radiation in the afterglow) required to convert $t_{\mathrm{jet}}^{(\mathrm{src})}$ to a jet opening angle. This purged an important source of systematic error from the problem. Firmani et al. (2006) went a step further, passing to 2-D projections of the space ($E_{\mathrm{iso}}$, $E_{\mathrm{pk}}^{(\mathrm{src})}$, $T_{0.45}^{(\mathrm{src})}$), which, by replacing the difficult-to-obtain jet-break time with the more easily measured prompt duration, made many more GRBs available as potential standard candles.

A fly in the ointment was noticed early on by several authors (Friedman and Bloom, 2005; Liang and Zhang, 2005; Firmani et al., 2005): the Amati and Ghirlanda relations were calibrated assuming a standard concordance $\Lambda$CDM cosmology. That is to say, it is not possible to construct quantities such as $E_{\mathrm{iso}}$ or $E_\gamma$ from the observed GRB prompt fluence $S$ without reference to a specific cosmological model to supply the luminosity distance. As the cosmological model is precisely what is to be constrained from the data, an inconsistency would appear to have been introduced into the problem. This is the (apparent) "Circularity Problem" of GRB standard candles.

Much effort and ingenuity have gone into the abatement of the circularity problem. Friedman and Bloom (2005) performed fits of the Ghirlanda Relation assuming a wide range ($0 < \Omega_M, \Omega_\Lambda < 2$) of "fiducial" cosmologies, and used each such fit to infer confidence regions on the true parameters, reporting regions that bracket all the results. Aside from this procedure being rather conservative, it is difficult to understand what sort of confidence probability is to be ascribed to such intervals. This is problematic if confidence intervals from GRB studies are to be combined with those from other types of Dark Energy studies, such as SN Ia.

The procedures adopted by Liang and Zhang (2005) and by Firmani et al. (2005) are, from a conceptual point of view, even more problematic. Liang and Zhang (2005) re-fit the correlation for each "fiducial" cosmology, to obtain a $\chi^2_{\mathrm{corr}}$. For each such fit to the correlation, they fit to the cosmological parameters, and re-weight the probability of the cosmological parameter fit by $\exp(-\chi^2_{\mathrm{corr}}/2)$, which is in effect an ad hoc, tacit, and oddly data-dependent choice of prior.

Firmani et al. (2005) explicitly embrace Bayesian logic, by interpreting the likelihood obtained for cosmology $\Omega$ using Ghirlanda-relation fits obtained assuming "fiducial" cosmology $\bar\Omega$ as a conditional probability $P(\Omega|\bar\Omega)$, an interpretation that is both mathematically inconsistent (such an expression should be proportional to the Dirac delta function $\delta(\Omega - \bar\Omega)$), and logically dubious (what information could cosmology $\bar\Omega$ possibly supply about cosmology $\Omega$?). They then eliminate $\bar\Omega$ from their results by marginalizing this probability with some prior on $\bar\Omega$. This removes the nuisance parameter $\bar\Omega$ from the final expressions, but does not correct the logical inconsistency that underlies the calculation.

More recently, an "astronomical" fix has been proposed for the circularity problem. Liang et al. (2008) interpolate distance moduli from SN Ia at the same redshift ($z < 1.4$) to "train" the correlations at low redshift. This is not terribly different from using nearby SN Ia to calibrate the Phillips relation for all SN Ia, and is not conceptually as problematic as some of the above approaches. However, it is a rather weak solution, since the relation must be calibrated using a small subset of GRBs, and since it means that GRB distance moduli can never even in principle be determined more accurately than SN Ia distance moduli. Moreover, if confidence regions on Dark Energy parameters obtained using such a calibration are to be combined with confidence regions obtained from SN Ia, a new hidden statistical dependence will have been introduced that will be difficult to characterize.

It is unfortunate that so much effort has been thus addressed to solving this problem. As I show below, there is no real circularity problem, and there never was. To the extent that a "problem" exists, it is, in effect, a self-inflicted wound, arising from an unfortunate choice of data variables: "source-frame" variables such as $E_{\mathrm{iso}}$ and $E_\gamma$, which are, by their construction, unnecessarily encumbered by cosmological considerations. If, instead, the empirical correlations of GRB phenomenology which are formulated in source variables are mapped to the primitive observational variables such as fluence (so that the model may discharge its duty of predicting the data, without at the same time being obliged to assist in constructing it), then the circularity disappears, and the analysis may be carried out without fear of inconsistency or paradox.

A recent paper by Basilakos and Perivolaropoulos (2008) addresses the circularity issue by making explicit the dependence of "data" such as $E_{\mathrm{iso}}$ or $E_\gamma$ on the cosmological parameter $\Omega_M$ in the expression for the log-likelihood, and allows both the "data" and the model to vary with the model parameters in the fit. As such, this work does not make a clean separation between model and data, in the way that is in my opinion desirable. Nonetheless, for reasons that will be discussed in §2.2, the resulting formalism has some features that are similar to the one presented here.

Concomitantly with the necessary disentangling of data from cosmological modeling, I show below how multi-dimensional correlations of the sort projected down to two dimensions by Liang and Zhang (2005) and Firmani et al. (2006) can be fully, and more informatively, modeled in the higher-dimensional space in which they reside, by a Gaussian Tube model, which represents the correlation together with its intrinsic scatter. The Gaussian nature assumed for the scatter yields the benefit of easy convolution with measurement errors to furnish a likelihood function that may be put to the usual inferential work. The Gaussian Tube will be illustrated in this work by formulating it in the 4-D space of source-frame variables ($t_{\mathrm{jet}}^{(\mathrm{src})}$, $T_{0.45}^{(\mathrm{src})}$, $E_{\mathrm{pk}}^{(\mathrm{src})}$, $E_{\mathrm{iso}}$), and mapping it to the space of observables ($t_{\mathrm{jet}}^{(\mathrm{obs})}$, $T_{0.45}^{(\mathrm{obs})}$, $E_{\mathrm{pk}}^{(\mathrm{obs})}$, $S$). The extension to other spaces or to other observables is obvious and immediate.

For those GRBs that are endowed with all four observations, the full Gaussian Tube model is used to produce the event likelihood $\mathcal{L}_i$. For GRBs that are missing some measured observables, we may still calculate an event likelihood by using the projection of the Tube model into the space of available observables, whether that be 2-D or 3-D (the projection onto 1-D is a uniform distribution, which is uninformative). Thus it is possible to fit simultaneously to all GRBs for which at least two correlated observables are measured. This is a substantial technical advance, in that it was previously necessary to use separately samples of GRBs with different available measurements.

Projection of the tube model has additional uses beyond extending the data set. A mysterious multi-dimensional tube correlation model, however technically satisfying, is not persuasive unless one can verify that the data in fact justify the model. Fortunately, this is not hard to do. Once a best-fit cosmology $\Omega$ has been obtained (or once we have fixed $\Omega$ at the concordance model), we may project the best-fit Gaussian Tube model into various 2-D planes, exhibiting both its orientation and its Gaussian width, and superpose the applicable data in that plane, including measurement errors. We are thus able to exhibit the various existing 2-D correlations as different perspectives on the full, multi-dimensional correlation in a series of 2-D plots, and visually inspect the agreement with the data.

The organization of the remainder of this paper is as 
follows: in §2 I introduce the variables in play, define some 
notation, and exhibit the Gaussian Tube model in technical
detail. In §3, I discuss the mathematical details associated
with projection operations of the model into lower-dimensional
spaces. In §4 I discuss the procedures required
to compare the model to data: formulation of the event
likelihoods for the cases of full and partial data, and how
the event likelihoods are (trivially) strung together into a
full likelihood function for the ensemble of GRBs. In §5 I
discuss the use of model projections to verify that the data 
in fact has a nodding acquaintance with the difficult-to- 
visualize, multi-dimensional model. A discussion of likely 
data requirements of the method presented here is in §6. 
Final discussion and conclusions appear in §7. 

2. The Gaussian Tube Correlation Model 

The Gaussian Tube is defined as a density which is 
Gaussian about a symmetry axis along the direction of the 
correlation, and invariant along that axis. The finite-width
density is intended to represent the intrinsic scatter of the 
correlation. The model is a somewhat crude empirical ap- 
proximation, since it does not allow for the nature of the 
intrinsic scatter in the correlation to change as one moves 



up or down the symmetry axis. The benefit of the sim- 
plification is that the likelihood function of data endowed 
with Gaussian measurement errors may be computed ana- 
lytically, as I will show in §4. Some possibilities for moving 
beyond this simplification are indicated at the end of §5. 

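It is convenient to work with the logs of the observables as primary quantities. Accordingly, we define the 4-vectors $\mathbf{x}^{(\mathrm{src})}$ and $\mathbf{x}^{(\mathrm{obs})}$, whose components are the logs of the four source-frame variables and of the four corresponding primitive observables, respectively, together with an offset vector $\mathbf{f}(z, \Omega)$ that depends on the cosmology $\Omega$ and on $z$, where $z$ is the redshift of a particular GRB. In terms of this notation, the transformation from source variables to primitive observables of a particular GRB is simply

$$\mathbf{x}^{(\mathrm{obs})} = \mathbf{x}^{(\mathrm{src})} + \mathbf{f}(z, \Omega). \tag{5}$$

The transformation is thus an elementary shift, albeit one that is different for each GRB (since each is at a different redshift $z$).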

We will define the model in the space of $\mathbf{x}^{(\mathrm{src})}$, and use this relation to move it to the observable space when the time comes to compare the model to data.
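The shift of Eq. (5) can be illustrated with a short Python sketch. It is not part of the formalism itself: the component ordering (jet-break time, emission time, peak energy, then energy or fluence), the use of base-10 logs, the flat $\Lambda$CDM luminosity distance, the parameter values, and all function names are assumptions made for illustration. The offsets use the standard relations $t^{(\mathrm{obs})} = (1+z)\,t^{(\mathrm{src})}$, $E_{\mathrm{pk}}^{(\mathrm{obs})} = E_{\mathrm{pk}}^{(\mathrm{src})}/(1+z)$, and $S = (1+z)\,E_{\mathrm{iso}}/(4\pi D_L^2)$.

```python
import math

C_KM_S = 299792.458             # speed of light in km/s
MPC_CM = 3.0856775814913673e24  # centimetres per megaparsec


def luminosity_distance_cm(z, omega_m=0.3, h0=70.0, steps=1000):
    """Luminosity distance in cm for a flat LambdaCDM model (an assumed cosmology)."""
    omega_l = 1.0 - omega_m

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    # Composite Simpson rule for the comoving-distance integral (steps must be even).
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * inv_e(k * h)
    d_comoving_mpc = (C_KM_S / h0) * total * h / 3.0
    return (1.0 + z) * d_comoving_mpc * MPC_CM


def offset(z, omega_m=0.3, h0=70.0):
    """Offset vector f(z, Omega); assumed order [t_jet, T_0.45, E_pk, E_iso -> S]."""
    lz = math.log10(1.0 + z)
    d_l = luminosity_distance_cm(z, omega_m, h0)
    return [lz, lz, -lz, lz - math.log10(4.0 * math.pi * d_l ** 2)]


def source_to_observed(x_src, z, omega_m=0.3, h0=70.0):
    """Apply the shift x_obs = x_src + f(z, Omega) of Eq. (5)."""
    return [xs + fk for xs, fk in zip(x_src, offset(z, omega_m, h0))]


# Hypothetical burst: log10 of (t_jet [s], T_0.45 [s], E_pk [keV], E_iso [erg]).
x_src = [5.0, 1.0, 2.5, 53.0]
x_obs = source_to_observed(x_src, z=1.0)
```

Only `offset` depends on the cosmology, so a fit that varies $\Omega$ moves the model prediction while leaving the measured quantities untouched.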

2.1. The Axis Of The Tube 

The symmetry axis of the event density distribution is easily defined in terms of elementary analytical geometry. The direction of the tube axis is along a vector $\mathbf{n}$, and the axis passes through a point $\mathbf{x}_0$, so that points on the axis are defined by the parametric relation $\mathbf{x} = \mathbf{x}_0 + t\,\mathbf{n}$, for all real $t$.

This parametrization is not unique, since $\mathbf{n}$ may be multiplicatively rescaled, and $\mathbf{x}_0$ may be shifted by a multiple of $\mathbf{n}$. In order to fix a non-degenerate parametrization it is necessary to choose a definite scale for $\mathbf{n}$ and a definite intercept for $\mathbf{x}_0$. We will choose $\mathbf{n}^T = [n_1, n_2, n_3, 1]$¹, and $\mathbf{x}_0^T = [x_{0,1}, x_{0,2}, x_{0,3}, 0]$. Thus 6 parameters are required to specify the axis.

2.2. The Gaussian Density 

The Gaussian Tube density is denoted by $\rho(\mathbf{x}^{(\mathrm{src})})\, d^4x^{(\mathrm{src})}$, where

$$\rho(\mathbf{x}^{(\mathrm{src})}) = N \exp\left[-\frac{1}{2}\,(\mathbf{x}^{(\mathrm{src})} - \mathbf{x}_0)^T\, \mathbf{B}\, (\mathbf{x}^{(\mathrm{src})} - \mathbf{x}_0)\right], \tag{6}$$

and where $\mathbf{B}$ is a non-negative-definite matrix.

The eigenvectors of B are the principal directions of the 
ellipsoids of constant density. If the eigenvalue correspond- 
ing to a certain eigenvector should become very small, the 
ellipsoids will become very elongated along the correspond- 
ing direction. In the limit of an eigenvalue going to zero, 
the distribution will be infinitely elongated, becoming, in 
effect, a tube. The condition that the tube should be ori- 
ented along the direction n is therefore B • n = 0. 

We require a useful parametrization of $\mathbf{B}$ that will satisfy this condition. Such a parametrization may be exhibited by introducing dual vectors (linear maps from vectors to numbers) $\mathbf{w}_i$, $i = 1, 2, 3$, satisfying $\mathbf{w}_i(\mathbf{n}) = 0$. Then, we may choose

$$\mathbf{B} = \sum_{i,j=1}^{3} b^{ij}\, \mathbf{w}_i \otimes \mathbf{w}_j, \tag{7}$$

where the $b^{ij}$ are components of a positive-definite matrix. By construction, this $\mathbf{B}$ evidently satisfies $\mathbf{B} \cdot \mathbf{n} = 0$.

A convenient choice of the $\mathbf{w}_i$ may be specified in terms of the dual basis $\mathbf{g}_\nu$, $\nu = 1, \ldots, 4$, which is dual to the coordinate direction vectors $\mathbf{e}^\mu$, $\mu = 1, \ldots, 4$, in the sense that $\mathbf{g}_\nu(\mathbf{e}^\mu) = \delta_\nu^\mu$. Then we may choose

$$\mathbf{w}_i = \mathbf{g}_i - n_i\, \mathbf{g}_4. \tag{8}$$

It is straightforward to verify that $\mathbf{w}_i(\mathbf{n}) = 0$ (recall that $n_4 = 1$ by convention). The $\mathbf{w}_i$ may be written in component form $\mathbf{w}_i = \sum_{k=1}^{4} w_i^k\, \mathbf{g}_k$, where from Eq. (8)

$$w_i^k = \delta_i^k - n_i\, \delta_4^k. \tag{9}$$

By substituting the components of the $\mathbf{w}_i$ from Eq. (9) into Eq. (7), we may obtain the matrix components of $\mathbf{B}$ along the coordinate dual basis $\mathbf{g}_\nu$ (which is what we mean by the "matrix" $\mathbf{B}$):

$$B^{kl} = \sum_{i,j=1}^{3} b^{ij}\, (\delta_i^k - n_i\, \delta_4^k)(\delta_j^l - n_j\, \delta_4^l), \qquad k, l = 1, \ldots, 4. \tag{10}$$

It remains to guarantee that the components $b^{ij}$ produce a matrix $\mathbf{B}$ satisfying $\mathbf{x}^T \cdot \mathbf{B} \cdot \mathbf{x} \ge 0$ for all vectors $\mathbf{x}$, and $\mathbf{x}^T \cdot \mathbf{B} \cdot \mathbf{x} = 0$ only when $\mathbf{x} \propto \mathbf{n}$. From Eq. (7), this is clearly equivalent to the condition that the matrix $\mathbf{b}$ with components $b^{ij}$ should be a positive-definite matrix. A parametrization that guarantees this is the Cholesky Decomposition $\mathbf{L}$ of $\mathbf{b}$ (see, e.g. Golub and Loan, 1989, p. 141). This is the unique lower-triangular matrix with components satisfying $L^{ii} > 0$ and $L^{ij} = 0$, $j > i$, in terms of which $\mathbf{b} = \mathbf{L}\mathbf{L}^T$, that is,

$$b^{ij} = \sum_{k=1}^{\min(i,j)} L^{ik} L^{jk}. \tag{11}$$

We therefore adopt the $(3 \times 4)/2 = 6$ components $L^{ij}$ of $\mathbf{L}$ as the parameters which, together with $\mathbf{n}$, control the quadratic form $\mathbf{B}$.

1 The more familiar scale choice of $\mathbf{n} \cdot \mathbf{n} = 1$ (i.e. choosing a unit vector for $\mathbf{n}$) is not particularly natural in this context. The reason is that there is no natural Euclidean metric defined in the space of observables, and we have no particular reason to import one. The cost of the additional complexity introduced by a quadratic normalization convention is not offset by any benefit (such as, for example, a normalization that is invariant under a relevant class of reparametrizations).
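The construction of Eqs. (7), (9), and (11) can be sketched in a few lines of Python. The function names and the numerical values of $\mathbf{n}$ and $\mathbf{L}$ are hypothetical, chosen only to show that the resulting matrix annihilates $\mathbf{n}$ and is otherwise positive.

```python
def cholesky_product(L):
    """Return b = L L^T for a 3x3 lower-triangular L (Eq. 11)."""
    return [[sum(L[i][k] * L[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


def tube_matrix(n, L):
    """Return the 4x4 matrix B of Eq. (7) for n = [n1, n2, n3, 1]."""
    b = cholesky_product(L)
    # Components of the dual vectors w_i (Eq. 9); list index 3 is the fourth coordinate.
    w = [[(1.0 if k == i else 0.0) - (n[i] if k == 3 else 0.0) for k in range(4)]
         for i in range(3)]
    return [[sum(b[i][j] * w[i][k] * w[j][l] for i in range(3) for j in range(3))
             for l in range(4)] for k in range(4)]


def quad_form(B, x):
    """Evaluate x^T B x."""
    return sum(x[k] * B[k][l] * x[l] for k in range(4) for l in range(4))


# Hypothetical parameter values, for illustration only.
n = [0.5, -1.2, 2.0, 1.0]
L = [[1.0, 0.0, 0.0],
     [0.3, 2.0, 0.0],
     [-0.5, 0.1, 0.7]]
B = tube_matrix(n, L)
B_dot_n = [sum(B[k][l] * n[l] for l in range(4)) for k in range(4)]  # zero up to rounding
```

Because the fourth component of $\mathbf{n}$ is fixed at 1, the upper-left $3 \times 3$ block of `B` is just the matrix $\mathbf{b}$.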

It is necessary at this point to be more definite about 
the normalization "constant" N that figures in Eq. (6). 
This normalization is of course constant with respect to 
the variables x. It is not constant with respect to the 
parameters L, however. This is because we must require 
that the model be somehow normalized, so that we can 
vary the shape of the tube (the L) without varying the 
predicted overall rate of GRB events. This is an essential 
feature of the model, without which the task of inferring 
the L from the data will certainly fail.²

The normalization must have the following property:
the integral of $\rho(\mathbf{x})$ on any 3-D hyperplane must be independent
of $\mathbf{L}$. Loosely speaking, this guarantees that
changing the width and "cross-sectional shape" of the Gaus- 
sian Tube does not change the overall event rate of pre- 
dicted GRBs. This allows us to decouple the aspects of 
the model that predict the correlation shape (which we 
care about) from the aspects that predict GRB event rates 
(which we do not). 

This normalization is easily exhibited: it is

$$N = \prod_{i=1}^{3} L^{ii}, \tag{12}$$

which is just the square root of the determinant of the matrix $\mathbf{b}$. This is roughly speaking 1/(product of $\sigma$ over non-degenerate principal directions), which is the standard normalizing factor of Gaussian distributions (except for inessential $\pi$-related factors, which fortunately are constant).

2 The failure would take the form of an inability of the likelihood function to prefer tight correlations to dispersed ones. Since the term in the exponential of the Gaussian is negative quadratic, and hence bounded above by zero, the fit to the data could simply proceed by making $\mathbf{B} \to 0$ (which makes the model a uniform density), reaching the maximum attainable likelihood irrespective of how bad the correlation really is. It is the job of the normalization "constant" to prevent this catastrophe. The normalization of Eq. (12) guarantees that the likelihood will decline to zero if $\mathbf{B}$ attempts to go to zero.

At this point, enough of the model is in view to allow a comparison with the work of Basilakos and Perivolaropoulos (2008), who consider separately various 2-D correlations. As pointed out in §1, there was no clean separation made between model and data in that paper. Nonetheless, Basilakos and Perivolaropoulos (2008) also work with log-space observables, so that the relation between their source- and observer-frame variables is still given by an offset, as in Eq. (5). Since they use $\chi^2$, which is a function of model-data difference, it is immaterial from the point of view of the formula whether the offset is applied to the model, as here, or the negative offset is applied to the data, as in their work. Note, however, that while $\chi^2$ is effectively the argument of the exponential of the likelihood in Eq. (6), it does not represent the dependence of the likelihood on the parameters through the normalization $N$, which, as argued above, is an important omission. The effect of the omission may be seen from Eq. (5) of Basilakos and Perivolaropoulos (2008), where in the limit of the slope parameter $a \to \infty$, the expression for $\chi^2$ saturates at a constant value, leading to open confidence contours. It is precisely this circumstance that the normalization of Eq. (12) avoids.
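The role of the normalization can be seen in a toy Python sketch with a single transverse dimension, where the tube reduces to a density $\ell \exp(-\ell^2 d^2/2)$ in the residual $d$ from the axis and $\ell$ plays the part of the $L^{ii}$. The residual values and the grid are invented for illustration; nothing here is fit to real data.

```python
import math


def log_like(ell, residuals, normalized=True):
    """Log-likelihood of transverse residuals for a 1-D Gaussian cross-section."""
    quad = -0.5 * ell * ell * sum(d * d for d in residuals)
    # The analogue of Eq. (12): one factor of ell per event.
    norm = len(residuals) * math.log(ell) if normalized else 0.0
    return quad + norm


# Hypothetical residuals of five events from the tube axis.
residuals = [0.3, -0.1, 0.25, -0.4, 0.05]
grid = [0.01 * k for k in range(1, 2001)]

best_normalized = max(grid, key=lambda ell: log_like(ell, residuals, True))
best_unnormalized = max(grid, key=lambda ell: log_like(ell, residuals, False))

# With the normalization the maximum sits at ell^2 = n / sum(d^2).
analytic = math.sqrt(len(residuals) / sum(d * d for d in residuals))
```

Without the normalization the optimizer runs to the smallest `ell` on the grid, which is the uniform-density limit described in the footnote; with it, the optimum lands at the finite width set by the scatter of the data.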

2.3. Sampling From A Gaussian Tube 

Methods for sampling from the distribution defined by 
a Gaussian Tube are obviously of some interest if one in- 
tends to simulate events from such a distribution. Sam- 
pling from the tube is not a straightforward exercise in 
multidimensional Gaussian sampling, as one might imag- 
ine upon contemplating Eq. (6), since the degenerate di- 
rection n complicates matters somewhat. 

Nonetheless there are no insurmountable difficulties or dispiriting complications here. The main idea is that one samples a vector $\mathbf{x}_\perp$ from a 3-D multivariate Gaussian in the space of vectors dual to the dual vectors $\mathbf{w}_i$, that is, in the 3-D vector space of equivalence classes of vectors differing only by a multiple of $\mathbf{n}$ (this is the so-called "Quotient Space" $V/\mathbf{n}$ of the full vector space $V$ by the subspace spanned by $\mathbf{n}$). One then samples a real number $\lambda$ from a uniform distribution in some chosen range. The full sample vector is $\mathbf{x} = \mathbf{x}_\perp + \lambda\,\mathbf{n} + \mathbf{x}_0$.

Operationally, this is straightforward. From Eq. (8), it is apparent that we can choose as representative vectors for an orthogonal basis of the quotient space the vectors $\mathbf{e}^i$, $i = 1, 2, 3$. The reduced matrix $b^{ij}$ in Eq. (10) may be thought to operate on components of vectors expressed in this basis. That is to say, we may sample from a 3-D multivariate Gaussian with inverse covariance components given by $b^{ij}$, ascribing the components of the sampled vectors to the first three components of $\mathbf{x}_\perp$ (whose fourth component is zero). One then proceeds from $\mathbf{x}_\perp$ to $\mathbf{x}$ as described above.



Note that the choices $n_4 = 1$, $x_{0,4} = 0$ imply that a vector $\mathbf{x}$ sampled in this way satisfies $x_4 = \lambda$. Thus the chosen range of $\lambda$ is also the chosen range of $x_4$.

The fact that we choose the Cholesky decomposition $L^{ij}$ to parametrize $b^{ij}$, so that $\mathbf{b} = \mathbf{L}\mathbf{L}^T$, is of some assistance here. If we sample three numbers $\mathbf{s} = (s_1, s_2, s_3)$ independently from a 1-D standard normal distribution, then the vector $(\mathbf{L}^{-1})^T \mathbf{s}$ is easily seen to be sampled from a Gaussian distribution with inverse covariance $\mathbf{b}$, as required.
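The sampling recipe of this section can be sketched as follows. The function names, the parameter values, and the range of $\lambda$ are hypothetical; the only ingredients taken from the text are the back substitution that applies $(\mathbf{L}^{-1})^T$ to a standard normal vector and the relation $\mathbf{x} = \mathbf{x}_\perp + \lambda\,\mathbf{n} + \mathbf{x}_0$.

```python
import random


def solve_lt(L, s):
    """Solve L^T y = s by back substitution for a 3x3 lower-triangular L."""
    y = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        # Row i of L^T holds L[k][i] for k >= i.
        y[i] = (s[i] - sum(L[k][i] * y[k] for k in range(i + 1, 3))) / L[i][i]
    return y


def sample_tube(n, x0, L, lam_range, rng):
    """Draw one 4-vector from the Gaussian Tube with axis (n, x0) and shape L."""
    s = [rng.gauss(0.0, 1.0) for _ in range(3)]
    x_perp = solve_lt(L, s) + [0.0]          # fourth component is zero
    lam = rng.uniform(lam_range[0], lam_range[1])
    return [x_perp[k] + lam * n[k] + x0[k] for k in range(4)]


# Hypothetical parameter values, for illustration only.
rng = random.Random(12345)
n = [0.5, -1.2, 2.0, 1.0]
x0 = [1.0, 0.5, 2.0, 0.0]
L = [[1.0, 0.0, 0.0],
     [0.3, 2.0, 0.0],
     [-0.5, 0.1, 0.7]]
samples = [sample_tube(n, x0, L, (50.0, 54.0), rng) for _ in range(5)]
```

Since $n_4 = 1$ and $x_{0,4} = 0$, the fourth component of each sample equals the drawn $\lambda$, so the chosen range bounds that coordinate directly.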

2.4. Summary Of The Model 

The 4-D Gaussian Tube model is therefore characterized by 12 parameters: 6 parameters required to establish the location and orientation of the tube through the vectors $\mathbf{n}$ and $\mathbf{x}_0$, and another 6 parameters required to set up the actual Gaussian distribution about that axis, through the lower-triangular matrix $\mathbf{L}$, which is used to obtain the quadratic form $\mathbf{B}$ using Eqs. (7), (8), and (11). In a more general $N$-dimensional space of observables, the parameter count would be $(N - 1)(N + 4)/2$.

Including the normalization, the expression for the model density is

$$\rho(\mathbf{x}) = \left(\prod_{i=1}^{3} L^{ii}\right) \exp\left[-\frac{1}{2}\,(\mathbf{x} - \mathbf{x}_0)^T\, \mathbf{B}\, (\mathbf{x} - \mathbf{x}_0)\right]. \tag{13}$$

3. Projection



As discussed in the Introduction, we require the abil- 
ity to project the Tube onto lower-dimensional subspaces. 
This may be for the sake of visualizing the correlation in 
2-D, or it may be in order to compare the model to the 
data from a GRB that is not supplied with all four possible 
observations. 

The process of "projecting" a 4-D correlation down to 
a subspace (such as a visualizable 2-D plane) is, in effect, 
marginalization over the remaining dimensions. This is a 
standard operation in Gaussian probability theory, which 
will now be briefly reviewed. 

Suppose that we wish to project onto a subspace, by
marginalizing the Gaussian Tube over the complement of 
the subspace. We partition all vectors and matrices into 
the two subspaces: 



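$$\mathbf{n} = \begin{bmatrix} \mathbf{n}_1 \\ \mathbf{n}_2 \end{bmatrix}; \quad \mathbf{x}_0 = \begin{bmatrix} \mathbf{x}_{0,1} \\ \mathbf{x}_{0,2} \end{bmatrix}; \quad \mathbf{x} = \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix}; \quad \mathbf{B} = \begin{bmatrix} \mathbf{B}_{11} & \mathbf{B}_{12} \\ \mathbf{B}_{12}^T & \mathbf{B}_{22} \end{bmatrix}. \tag{14}$$

We will project onto subspace "1", by marginalizing over subspace "2".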
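The projected density is

$$\eta(\mathbf{x}_1) = \left(\prod_{i=1}^{3} L^{ii}\right) \int d\mathbf{x}_2\, \exp\left\{-\frac{1}{2}\left[(\mathbf{x}_1 - \mathbf{x}_{0,1})^T \mathbf{B}_{11} (\mathbf{x}_1 - \mathbf{x}_{0,1}) + 2\,(\mathbf{x}_1 - \mathbf{x}_{0,1})^T \mathbf{B}_{12} (\mathbf{x}_2 - \mathbf{x}_{0,2}) + (\mathbf{x}_2 - \mathbf{x}_{0,2})^T \mathbf{B}_{22} (\mathbf{x}_2 - \mathbf{x}_{0,2})\right]\right\}. \tag{15}$$

The integral may be performed by completing the square. The result is

$$\eta(\mathbf{x}_1) = \left(\prod_{i=1}^{3} L^{ii}\right) \det|\mathbf{B}_{22}|^{-1/2} \exp\left[-\frac{1}{2}\,(\mathbf{x}_1 - \mathbf{x}_{0,1})^T \mathbf{A}_{11} (\mathbf{x}_1 - \mathbf{x}_{0,1})\right], \tag{16}$$

where

$$\mathbf{A}_{11} = \mathbf{B}_{11} - \mathbf{B}_{12}\, \mathbf{B}_{22}^{-1}\, \mathbf{B}_{12}^T. \tag{17}$$

The Gaussian quadratic form in the projected space is therefore $\mathbf{A}_{11}$. Note that the term $\det|\mathbf{B}_{22}|^{-1/2}$ may not be dropped from the normalization of $\eta(\mathbf{x}_1)$, for the same reasons that motivate respect for the parameter-dependence of the normalization of the full 4-D model density $\rho(\mathbf{x})$.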

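It is not hard to show that $\mathbf{A}_{11} \cdot \mathbf{n}_1 = 0$, as expected. This is because the partitioned version of $\mathbf{B} \cdot \mathbf{n} = 0$ is

$$\mathbf{B}_{11} \cdot \mathbf{n}_1 = -\mathbf{B}_{12} \cdot \mathbf{n}_2, \tag{18}$$
$$\mathbf{B}_{12}^T \cdot \mathbf{n}_1 = -\mathbf{B}_{22} \cdot \mathbf{n}_2, \tag{19}$$

so that

$$\begin{aligned} \mathbf{A}_{11} \cdot \mathbf{n}_1 &= \mathbf{B}_{11} \cdot \mathbf{n}_1 - \mathbf{B}_{12}\, \mathbf{B}_{22}^{-1}\, \mathbf{B}_{12}^T \cdot \mathbf{n}_1 \\ &= \mathbf{B}_{11} \cdot \mathbf{n}_1 + \mathbf{B}_{12}\, \mathbf{B}_{22}^{-1}\, \mathbf{B}_{22} \cdot \mathbf{n}_2 \\ &= \mathbf{B}_{11} \cdot \mathbf{n}_1 + \mathbf{B}_{12} \cdot \mathbf{n}_2 \\ &= \mathbf{B}_{11} \cdot \mathbf{n}_1 - \mathbf{B}_{11} \cdot \mathbf{n}_1 \\ &= 0. \end{aligned} \tag{20}$$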



In summary, all there really is to know about projection is the partitioning trick: the projected Gaussian Tube has a direction $\mathbf{n}_1$ and an offset $\mathbf{x}_{0,1}$ that are merely the appropriate partitions of their higher-dimensional counterparts, and a quadratic form given by Eq. (17).

4. Model-Data Comparison 

As was mentioned in the introduction, the comparison 
of model and data is necessarily to be carried out in the 
observable space, and not, as is unfortunately customary, 
in the source variable space. The reason is that this is the 
only sensible way to disentangle the cosmology from the 



data, and permit well-defined estimation of cosmological 
parameters. 

Suppose that the $i$th GRB (with precisely-determined redshift $z_i$) resulted in a measurement $\mathbf{y}_i$ of the event's true observables $\mathbf{x}^{(\mathrm{obs})}$. We will not assume that all four observables are available to encode in $\mathbf{y}_i$. Instead, we will assume that $\mathbf{y}_i$ is an $n$-dimensional vector, with $2 \le n \le 4$ (so, for example, if all that is available is $E_{\mathrm{pk}}^{(\mathrm{obs})}$ and $S$, then $n = 2$). We will also encode the measurement errors of the components of $\mathbf{y}_i$ as the matrix elements of an $n \times n$ diagonal matrix $\mathbf{D}_i$, defined as

$$[\mathbf{D}_i]_{kl} = \delta_{kl}\, \sigma_{il}^{-2}, \tag{21}$$

where $\sigma_{il}$ is the measurement error on the $l$th component of $\mathbf{y}_i$.

The strategy for calculating the likelihood function of all the data is to calculate the event likelihood $P(\mathbf{y}_i | z_i, \mathbf{n}, \mathbf{x}_0, \mathbf{L}, \Omega)$. Since the data for different GRBs are statistically independent, the total likelihood is the product of all the individual event likelihoods:

$$\mathcal{L}(\mathbf{n}, \mathbf{x}_0, \mathbf{L}, \Omega) = \prod_{i=1}^{N} P(\mathbf{y}_i | z_i, \mathbf{n}, \mathbf{x}_0, \mathbf{L}, \Omega). \tag{22}$$

The problem is therefore reduced to the calculation of the event likelihood for each GRB.

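We first require the transformed model density in the full observable space, $\xi(\mathbf{x}^{(\mathrm{obs})})\, d^4x^{(\mathrm{obs})} = \rho(\mathbf{x}^{(\mathrm{src})})\, d^4x^{(\mathrm{src})}$. This is easily obtained, given the redshift $z_i$, using the transformation of Eq. (5). As this transformation is a pure constant offset, its Jacobian is 1, and we have

$$\xi(\mathbf{x}^{(\mathrm{obs})}) = \left(\prod_{i=1}^{3} L^{ii}\right) \exp\left[-\frac{1}{2}\, \Delta\mathbf{x}^T\, \mathbf{B}\, \Delta\mathbf{x}\right], \tag{23}$$

where

$$\Delta\mathbf{x} = \mathbf{x}^{(\mathrm{obs})} - \mathbf{x}_0 - \mathbf{f}(z_i, \Omega). \tag{24}$$

If the $i$th GRB is endowed with all 4 observations, this is sufficient. If, on the other hand, $n < 4$, we must project $\xi(\mathbf{x}^{(\mathrm{obs})})$ down to the appropriate space. We adopt the partitioning $\mathbf{x}^{(\mathrm{obs})\,T} = [\mathbf{u}^T, \mathbf{v}^T]$, and project out $\mathbf{v}$ to obtain $\eta(\mathbf{u})$ by the technique of §3, obtaining

$$\eta(\mathbf{u}) = \left(\prod_{i=1}^{3} L^{ii}\right) \det|\mathbf{B}_{vv}|^{-1/2} \exp\left[-\frac{1}{2}\, \Delta\mathbf{x}_u^T\, \mathbf{A}_{uu}\, \Delta\mathbf{x}_u\right], \tag{25}$$

where

$$\mathbf{A}_{uu} = \mathbf{B}_{uu} - \mathbf{B}_{uv}\, \mathbf{B}_{vv}^{-1}\, \mathbf{B}_{uv}^T, \tag{26}$$
$$\Delta\mathbf{x}_u = \mathbf{u} - \mathbf{x}_{0,u} - \mathbf{f}_u(z_i, \Omega), \tag{27}$$

and where the meaning of the partitioned vectors and matrices should be clear from context.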
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We may now convolve this distribution with the mea- 
surement error distribution on y.;. This is tantamount to 
integrating over the entire space u the probability that 
the true value of the observables should have been u and 
that the actual measured values should have been y. By 
a routine Gaussian integration, we obtain 



$$
\begin{aligned}
P(y_i \mid z_i;\, n, x_0, L, \Omega)
 &\propto \det|B_{vv}|^{-1/2} \int du\,
   \exp\left[-\tfrac{1}{2}\,\bigl(u - x_{0,u} - f_u(z_i,\Omega)\bigr)^T A_{uu}\,
   \bigl(u - x_{0,u} - f_u(z_i,\Omega)\bigr)\right] \\
 &\qquad\qquad \times \exp\left[-\tfrac{1}{2}\,(u - y_i)^T D_i\,(u - y_i)\right] \\
 &\propto \det|B_{vv}|^{-1/2}\; \det|D_i + A_{uu}|^{-1/2}\;
   \exp\left[-\tfrac{1}{2}\,\Delta y_i^T Q_i\, \Delta y_i\right],
\end{aligned}
\tag{28}
$$

where

$$
Q_i = D_i - D_i\,(D_i + A_{uu})^{-1} D_i, \tag{29}
$$

$$
\Delta y_i = y_i - x_{0,u} - f_u(z_i, \Omega). \tag{30}
$$

Eqs. (28), (29), and (30) are the final result for the
event likelihood, in the case where data is incomplete and
projection is necessary. Obviously, if the full set of
observations is available, projection is not necessary, and these
formulas are to be applied by replacing $A_{uu}$ by $B$,
$\det|B_{vv}|$ by 1, $x_{0,u}$ by $x_0$, and $f_u(z_i, \Omega)$
by $f(z_i, \Omega)$.

As complicated as these formulas may appear, they do
not represent much of a computational challenge, as the
determinants and inverses that are required are of symmetric,
positive-definite matrices with dimensionality less than
or equal to 4. Perhaps a slightly greater computational
challenge is the code organization required to arrange for
the capability of carrying out the projection/partition of
relevant matrices and vectors along arbitrary subsets of
coordinates. This is nonetheless a manageable programming
task.

With the event likelihood in hand, we may proceed
to the calculation of the total likelihood $\mathcal{L}$, as
given by Eq. (22).

And now, we're in business. For example, we may
simultaneously optimize $\mathcal{L}(G, \Omega)$ (where G
represents the Gaussian Tube parameters) with respect to G and
$\Omega$, obtaining fully internally-calibrated point estimates
of both sets of parameters, and perhaps even frequentist
confidence intervals.

We may also play Bayesian games, using some choice of
prior over the parameters to trade $\mathcal{L}$ in for a
posterior density distribution $P(G, \Omega \mid O)$, where O
represents the observations. We may then marginalize some of the
parameters to produce Bayesian confidence regions on others. This
may require a Markov Chain Monte Carlo approach, given
the large number of parameters. Or, we may make the
approximation that marginalization is equivalent to
extremization (which is true for Gaussian distributions) 3, and
calculate an approximate posterior density on the cosmology
parameters $P(\Omega \mid O)$ by maximizing the full posterior
$P(G, \Omega \mid O)$ with respect to G at every value of
$\Omega$.

The point is, $\mathcal{L}$ is a genuine likelihood (the
probability of some data given a model) and may be pressed into
service in exactly the sort of ways that likelihoods are
normally used. The circularity concerns that derive from the
use of "fiducial cosmologies" to create the source-variable
"data" have been short-circuited by the simple expedient
of calculating the probability of the data that is directly
observed.
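The projection/partition bookkeeping and the event likelihood of Eqs. (28)-(30) can be sketched in a few lines. The following is only a minimal illustration in Python with NumPy, not code used for any analysis. The function names (`project_precision`, `event_log_likelihood`) and the argument layout are hypothetical, the projection is assumed to take the usual Schur-complement form $A_{uu} = B_{uu} - B_{uv} B_{vv}^{-1} B_{vu}$, the matrix $B$ is assumed positive-definite for simplicity, and normalization factors not shown explicitly are omitted.

```python
import numpy as np

def project_precision(B, observed):
    """Project the matrix B onto the observed coordinates.

    Returns (A_uu, log det|B_vv|), where A_uu is assumed to be the Schur
    complement B_uu - B_uv B_vv^{-1} B_vu (an assumption of this sketch).
    """
    n = B.shape[0]
    obs = set(observed)
    u = np.array(sorted(obs), dtype=int)
    v = np.array([k for k in range(n) if k not in obs], dtype=int)
    B_uu = B[np.ix_(u, u)]
    if v.size == 0:
        # Complete data: A_uu is replaced by B and det|B_vv| by 1.
        return B_uu, 0.0
    B_uv = B[np.ix_(u, v)]
    B_vv = B[np.ix_(v, v)]
    A_uu = B_uu - B_uv @ np.linalg.solve(B_vv, B_uv.T)
    _, log_det_Bvv = np.linalg.slogdet(B_vv)
    return A_uu, log_det_Bvv

def event_log_likelihood(y, D, B, x0, f, observed):
    """Log event likelihood, omitting normalization factors not shown here.

    y, D    : measured values and measurement-error matrix D_i, both
              restricted to the observed coordinates
    B, x0, f: full-dimensional tube matrix, offset x_0 and cosmology term f(z, Omega)
    """
    u = np.array(sorted(set(observed)), dtype=int)
    A_uu, log_det_Bvv = project_precision(B, observed)
    dy = y - x0[u] - f[u]                                # Delta y_i
    Q = D - D @ np.linalg.solve(D + A_uu, D)             # Q_i
    _, log_det_DA = np.linalg.slogdet(D + A_uu)
    return -0.5 * log_det_Bvv - 0.5 * log_det_DA - 0.5 * dy @ Q @ dy
```

Summing `event_log_likelihood` over events, each with its own list of observed coordinates, would give the logarithm of the total likelihood, again under the assumptions stated above.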



5. Sanity Checking 

The program of data analysis outlined so far relies on
some rather abstract and difficult-to-visualize constructions.
It is crucial that there should be some way to visualize
the relationship between the model and the data, both
to spot possible problems and to get an intuitive feeling
for the predictive content of the model.

Once a best-fit Gaussian Tube G and a best-fit cosmology
(or the concordance cosmology) $\Omega$ have been fixed,
this is a straightforward thing to do. There are six 2-D
planes that may be formed from the 4 available source
variables. The best-fit Gaussian Tube model may be projected
according to the method of §3 onto each of these
planes. The projected best-fit straight line and the 1-$\sigma$
confidence region from the projected distribution may be
plotted on each plane. Each GRB endowed with observations
that are representable on some of those planes may
have those observations mapped to the appropriate plane
(for example, a GRB with measured $S$, $E_{pk}^{(obs)}$, and
$T_{45}^{(obs)}$ may be mapped to the
$E_{pk}^{(src)} \times E_{iso}$, $E_{iso} \times T_{45}^{(src)}$,
and $E_{pk}^{(src)} \times T_{45}^{(src)}$ planes).

We finally end up with a series of six plots, each one dis- 
playing the projected model and the mapped data. From 
these plots, it should be possible to visualize directly the 
properties of the various projected aspects of the corre- 
lation model, and the extent to which the best-fit model 
really respects the data. 
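The bookkeeping behind these six plots is simple enough to sketch. The snippet below (Python; the variable labels and the function name `planes_for` are illustrative assumptions, not notation from this paper) enumerates the six 2-D planes formed from the four source variables and lists the planes on which a burst with a given subset of variables can be drawn.

```python
from itertools import combinations

# Labels for the four source variables (names chosen here for illustration).
SOURCE_VARS = ("E_iso", "E_pk_src", "t_jet_src", "T_45_src")

# The six coordinate planes that can be formed from four variables.
ALL_PLANES = list(combinations(SOURCE_VARS, 2))

def planes_for(available):
    """Return the planes on which a burst with the given variables can be plotted."""
    have = set(available)
    return [plane for plane in ALL_PLANES if have.issuperset(plane)]

# A burst lacking a jet-break time contributes a point to three of the six planes.
no_jet_break = ["E_iso", "E_pk_src", "T_45_src"]
print(len(ALL_PLANES))
for plane in planes_for(no_jet_break):
    print(plane)
```

A plotting routine would then loop over `ALL_PLANES`, draw the projected model once per plane, and overlay only those bursts for which `planes_for` returns that plane.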

Besides this sort of visual verification of the various
2-D correlations against the data, there is another model
verification issue that merits consideration. The observed
redshift distribution of GRBs drops dramatically at
z < 1, where most SN Ia redshifts are found, and extends
out past z = 6. This is an opportunity, of course,
since it means that GRB-derived confidence regions in, say,
the $\Omega_m$–$\Omega_\Lambda$ plane cut across those derived
from SN Ia (Ghirlanda et al., 2006), furnishing tighter
constraints on those parameters. However, the much broader range
of GRB redshifts raises a troubling question: Even if we find
a reasonable-seeming fit of the Gaussian Tube model's
projections to the data, how do we know whether the properties
of the tube should be considered to have evolved with
redshift? That is to say, is it reasonable to assume, as the
model does, that the correlations of GRB energetics follow
the same distributions irrespective of redshift? If so, how
do we know? If not, how would this affect the inferred
values of the cosmological parameters $\Omega$?

3 Note, however, that the posterior probability density over model
parameters can at best be only approximately Gaussian, despite the
Gaussian nature of the GRB density model.

This question cannot be addressed merely by inspecting
the projection plots described above, since the redshifts
are all intermingled in those plots. Instead, it seems
advisable to adopt a model-comparison strategy, wherein the
"vanilla" Gaussian Tube model described above is compared
to more complicated models (via a frequentist likelihood
ratio, or a Bayesian test based on posterior odds
ratios) that allow the correlation parameters to vary with
redshift. That is, we may introduce another hierarchical
level in the model by allowing some of the parameters G
to be some parametrized empirical function of z (a linear
function is an obvious thing to try), and calculate the
amount by which the log-likelihood (say) is improved in
this model over a model in which the G are the same for
all redshifts. Significant improvements would be evidence
for evolution of the distributions. Additionally, significant
shifts in the confidence regions in the
$\Omega_m$–$\Omega_\Lambda$ plane in
the more complicated model could be interpreted as an
indication of trouble, whereas stability of those contours as
the new parameters are varied would be a reassuring sign
of robustness of the results.

Clearly this approach entails some considerable expan- 
sion of the parameter space. A somewhat more modest 
approach, similar to the calibration approach of Basilakos 
and Perivolaropoulos (2008), is to partition the GRBs into 
a small number of bins, fit them separately, and determine 
whether the sum of the log likelihoods is significantly bet- 
ter than the log-likelihood for the full sample. Again, any 
significant shifts in $\Omega_m$–$\Omega_\Lambda$ contours, or lack of such
shifts, would be telling of the robustness of the inferences 
drawn from the model. 
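As an illustration of the frequentist variant of these comparisons, the sketch below converts a log-likelihood improvement into a p-value. It assumes that the more complicated model (redshift-dependent parameters, or separate fits per redshift bin) nests the vanilla model and that Wilks' theorem applies, so that twice the log-likelihood gain is approximately chi-squared distributed with as many degrees of freedom as there are added parameters. Neither assumption is established here, and the function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Survival function of the chi-squared distribution for integer df >= 1."""
    if x <= 0.0:
        return 1.0
    h = 0.5 * x
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)                # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))     # Q(1/2, h)
    # Recurrence Q(a + 1, h) = Q(a, h) + h^a e^{-h} / Gamma(a + 1), up to a = df / 2.
    while a < 0.5 * df - 1e-9:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q

def evolution_p_value(lnL_vanilla, lnL_extended, n_extra_params):
    """p-value for the null hypothesis of no evolution (Wilks approximation assumed)."""
    statistic = 2.0 * (lnL_extended - lnL_vanilla)
    return chi2_sf(statistic, n_extra_params)
```

For the binned variant, `lnL_extended` would be the sum of the per-bin maximum log-likelihoods and `n_extra_params` the number of Tube parameters times the number of bins minus one. A small p-value would count as evidence for evolution. The stability of the cosmological contours still has to be checked separately.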

6. Data Requirements 

This paper is a "methods" paper, and I have not yet
attempted to collect a carefully-calibrated sample of GRB
data to subject to this analysis. 4 I therefore cannot
suggest precise guidelines as to how many GRBs, bearing
what kind of information, may be necessary to obtain
interesting constraints on cosmological parameters using the
present methodology. Nonetheless the question is worth
addressing, so I offer a few tentative thoughts on the
matter.

4 The catalog of Schaefer (2007), with its many arbitrary
manual adjustments to compensate for missing data, is probably not
adequate for this purpose. The small statistical errors that would
result from its large size (69 events) would probably not compensate
for the large systematic errors introduced by the data aggregation
procedure.

The "vanilla" (i.e. not redshift-dependent) Gaussian 
Tube model presented here has 12 free parameters, and 
operates on 2 to 4 observable quantities per GRB. In
addition, a minimally interesting cosmological model offers
two additional parameters for constraining ($\Omega_m$ and
$\Omega_\Lambda$), so that a total of 14 parameters must be
managed in the fit.

Consider the Tube parameters G first. The role of these 
12 parameters is to ensure that the 6 2-D projections of 
the model adequately fit the projections of available data 
into those planes. The model is more compact than a 
model composed of 6 2-D Tube models (which would re- 
quire 18 parameters). Therefore, a conservative estimate 
of the amount of data required to constrain the full Gaus- 
sian Tube model is the amount required to constrain the 6 
independent 2-D Tube models. In each plane, this number 
would depend on the tightness of the correlation, the size 
of the measurement errors, and the dynamic range of the 
data. In the case of the original Amati relation (Amati 
et al., 2002), with 10 constrained data points, measure- 
ment errors in the 10-30% range, and a dynamic range in 
$E_{iso}$ of nearly 3 orders of magnitude, the fit parameters
that resulted had a statistical error of about 10%. Sup- 
pose, then, for the sake of making a conservative estimate, 
that we require 15 points in each plane for adequate con- 
straints on G. The number of events required to furnish 
this information is bounded below by 15 (if all events bear 
all information, so that 60 numbers are used), and above 
by 90 (if all points on all planes are due to different GRBs, 
so that 180 numbers are used). 
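The arithmetic behind these bounds can be spelled out explicitly. The short sketch below (Python; the constant names are chosen here for illustration) reproduces the lower and upper bounds on the number of events from the requirement of 15 points in each of the six planes.

```python
from math import comb

N_VARS = 4                # source variables carried by a fully observed GRB
POINTS_PER_PLANE = 15     # conservative requirement adopted above

n_planes = comb(N_VARS, 2)                      # number of 2-D coordinate planes
points_needed = n_planes * POINTS_PER_PLANE     # points summed over all planes

# Best case: every GRB carries all 4 variables and so appears on every plane.
min_events = points_needed // n_planes
min_numbers = min_events * N_VARS

# Worst case: every point on every plane comes from a different GRB
# that carries only 2 variables.
max_events = points_needed
max_numbers = max_events * 2

print(min_events, min_numbers, max_events, max_numbers)
```

A realistic sample, with a mixture of fully and partially observed bursts, would fall between these two extremes.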

Turning to the cosmological parameters, one may ob- 
serve that initially, the inflation of the Tube parameter 
count from 3 (for a single 2-D correlation such as the 
Ghirlanda relation) to 12 adds uncertainty to the contours 
in the $\Omega_m$–$\Omega_\Lambda$ plane, uncertainty which must be made
up by adding data that constrains those additional Tube 
parameters. If the data is in fact constraining on those 
additional parameters, then one may expect the statistical
errors on cosmological parameters to shrink roughly as
$N_{pair}^{-1/2}$, where $N_{pair}$ is the number of independent pairs
of event data (i.e. the total number of points in the 6 
projected planes). This is the point of the exercise: by 
passing to the Gaussian Tube model, one pays a price in 
parameter count inflation, in the expectation that one will 
reap a dividend through the larger and more informative 
dataset that thereby becomes accessible. 
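To make the expected scaling concrete, the sketch below counts the points contributed to the projected planes by a mixed sample and evaluates the corresponding change in statistical error. It is a rough illustration only: it takes the inverse-square-root scaling at face value, counts a burst with k measured variables as k(k-1)/2 points, and uses made-up sample compositions. The function names are hypothetical.

```python
from math import comb, sqrt

def n_pair(variables_per_grb):
    """Total number of points in the projected planes for a sample of bursts.

    A burst with k measured variables appears on C(k, 2) planes.
    """
    return sum(comb(k, 2) for k in variables_per_grb)

def relative_error(n_pair_new, n_pair_ref):
    """Statistical error relative to a reference sample, assuming N_pair^(-1/2) scaling."""
    return sqrt(n_pair_ref / n_pair_new)

# Made-up samples: 15 fully observed bursts, then the same plus
# 30 bursts that each lack one of the four variables.
sample_a = [4] * 15
sample_b = [4] * 15 + [3] * 30

print(n_pair(sample_a), n_pair(sample_b))
print(relative_error(n_pair(sample_b), n_pair(sample_a)))
```

In this toy case the incompletely observed bursts double the number of points, so the errors would shrink by a factor of about 0.71, which is the kind of dividend that the inclusion of inhomogeneous data is meant to deliver.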

In other words, the design of this framework requires a 
cost/benefit analysis. It is not necessarily the case that the 
particular choice of observables made in this work for the 
sake of illustration is optimal. Possibly a different set, or a 
smaller or larger set, might be preferable. Much depends 
on visual inspection of correlations. If one or more of the 
2-D projections of the data appear not to show evidence 
for a strong correlation, it may be the case that the parameters
controlling the correlation in that projected plane
may be adding more noise than signal, and it might be a 
good idea to change observables, or to freeze the respon- 
sible parameters at some harmless value. On the other 
hand, reasonably clear correlations of the data in all pro- 
jected planes would constitute evidence for a good choice 
of observables, one which is likely to reward the analysis 
with statistical errors on cosmological parameters that are 
smaller in consequence of more abundant data, and that 
shrink more rapidly with increasing data than would errors 
inferred from a lower-dimensional model. 

7. Discussion 

It is my hope that readers are persuaded that the circularity
problem of the GRB standard candle enterprise was
merely a diversion, an own-goal brought about by an un- 
fortunate choice of space for model-data comparison, and 
readily corrected by making a better choice. The vari- 
ous fix-ups for the "problem" that have been proposed 
in the literature, and which were discussed in §1, are not 
merely unnecessary: by taking an excessively conservative 
attitude towards parameter constraints, or actually intro- 
ducing incoherent features to their statistical model, they 
almost certainly do more harm than good. 

The view advocated here is that the various correla- 
tions that are discussed in the literature must necessar- 
ily be projected aspects of a higher-dimensional "super- 
correlation". At its root, this is really no more than the 
observation that if A is correlated with B, and B is corre- 
lated with C, then A is necessarily correlated with C, and 
a correlation structure must therefore exist in the joint 
space of A, B, and C. 

I cannot say at this stage what the various correla- 
tions look like in all six 2-D planes that may be con- 
structed from the present variables. However, it would 
be difficult to understand if they weren't about as tight 
as the Amati relation, unless there is something wrong 
with the Ghirlanda/Firmani/Liang-Zhang/etc. correlations,
which, as I explain below, I do not believe. Turning
this around, however, there is a very interesting possibil- 
ity: exhibiting correlations in alternative planes — includ- 
ing some built from burst durations and afterglow break 
times — strengthens the case for the reality of all these 
correlations, in the sense that it is difficult to imagine a 
selection effect of such perversity that it can produce both 
$E_{pk}$–$E_{iso}$ and $t_{jet}$–$T_{45}$ correlations (for example).

An additional remark concerning projections seems ap- 
posite. It is possible to search for 2-D projections that are 
not necessarily along the coordinate axes, which in some 
sense minimize scatter in the data. The Ghirlanda, Liang-Zhang,
and Firmani relations are of this character.
All that is required is to effect linear transformations of 
the coordinate axes, together with the corresponding sim- 
ilarity transformations on all matrices. One could imagine 
searching for the linear transformation that makes a correlation
in a 2-D projection look maximally tight. However,
it seems to me that the motivation for doing so is not as
strong in the current picture as it once was, since all the 
content of these relations is already embodied in the best- 
fit Gaussian Tube model. Certainly, the construction of 
such a transformation would have no effect whatever on 
the likelihood function computed above, or on any of the 
cosmological inferences drawn therefrom. 
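The invariance claimed here is easy to check numerically. The sketch below (Python with NumPy; all matrices and vectors are arbitrary illustrative values, not fitted quantities) applies an invertible linear map T to the coordinates, transforms an inverse-covariance-type matrix Q as $T^{-T} Q\, T^{-1}$, and confirms that the quadratic form entering the exponent of the likelihood is unchanged. For an orthogonal T (a rotation of the axes) this coincides with the similarity transformation $T Q\, T^{-1}$.

```python
import numpy as np

# Arbitrary symmetric positive-definite matrix and residual vector.
Q = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 3.0]])
dy = np.array([0.3, -1.2, 0.7])

# Arbitrary invertible change of coordinates (determinant 2).
T = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])
T_inv = np.linalg.inv(T)

dy_new = T @ dy                   # residual in the new coordinates
Q_new = T_inv.T @ Q @ T_inv       # matrix in the new coordinates

chi2_old = dy @ Q @ dy
chi2_new = dy_new @ Q_new @ dy_new
print(np.isclose(chi2_old, chi2_new))

# The log-determinant shifts only by a constant set by T.
shift = np.linalg.slogdet(Q_new)[1] - np.linalg.slogdet(Q)[1]
print(np.isclose(shift, -2.0 * np.log(abs(np.linalg.det(T)))))
```

The determinant factor changes only by a constant fixed by T, independent of the model parameters, so the location of the likelihood maximum and the shape of the confidence regions are unaffected.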

This remark underlines the essential fact that the most 
suitable space for visualizing the relationship between model 
and data is not necessarily the most suitable space for an- 
alyzing that relationship. It was the failure to understand 
this truism of data analysis that gave rise to the circularity 
problem in the first place. 

While the reality of these correlations has been harshly
questioned (Nakar and Piran, 2005; Band and Preece, 2005;
Butler et al., 2007), in my opinion the assuredness of these
critiques is out of all proportion to the questionable
cogency of the evidence upon which they rest. It should
be kept in mind that in order to even observe the
correlations, the essential requirements are (a) rapid, accurate
astrometry (to furnish afterglow redshifts), and (b) accurate
broadband spectroscopy (to obtain the essential spectral
fit parameters).

The critiques of Nakar and Piran (2005) and Band and 
Preece (2005) rely upon the fits of BATSE spectral data re- 
ported in Band et al. (1993), despite the fact that BATSE 
had no afterglows. Furthermore, as attested by columns 
5, 6, and 7 of Table 4 of Band et al. (1993), many of those 
spectral fits were of questionable quality. 

The critique of Butler et al. (2007) relies purely on
Swift spectral fits, but as Swift's bandpass is essentially
20-120 keV, there is no possibility of securing actual
spectral fit parameters. BATSE-informed priors must therefore
do some extremely heavy lifting. Nonetheless, Butler
et al. (2007) find an Amati Relation correlation in Swift 
data, with the correct slope, but with the wrong normal- 
ization. Curiously, rather than conclude that their priors 
might be exerting some uncontrolled influence, they infer 
instead that the inconsistency exposes the Amati relation 
as being due to a somewhat vaguely-specified instrumental 
selection effect. 

Meanwhile, every analysis based on data from instrument
complements capable of both prompt, accurate astrometry
and accurate broad-band spectroscopy, such as
from BeppoSAX (Amati et al., 2002) or from HETE (Sakamoto
et al., 2005), has found that with the exception of a small
number of conspicuous outliers (such as the under-luminous
GRB980425), new data invariably drapes itself across the
old, known correlations. In addition, analysis of
time-resolved spectroscopy of selected BATSE bursts by Liang
et al. (2004) showed that flux and $E_{pk}$ are Amati-correlated
within the time history of each GRB. Finally, Ghirlanda
et al. (2009), using 12 long GRBs jointly observed by
Swift and by Fermi/GBM (with GBM supplying the
spectroscopic coverage), not only confirm the time-resolved
GRB-personalized mini-Amati relations of long GRBs
discovered by Liang et al. (2004), but also show that the
normalizations of those mini-relations actually place them
on the BeppoSAX/HETE Amati relation, with the
BeppoSAX/HETE parameters (and, of course, the time-integrated
spectral parameters of all 12 events also fall on the
BeppoSAX/HETE Amati relation).

This debate would appear to be over: the various long
GRB phenomenological correlations are (except for a small
fraction of conspicuous outliers) convincingly confirmed,
and appear to be manifestations of "internal" features of
GRB emission. They must certainly be taken seriously.
Given that Swift and Fermi/GBM appear capable of
producing about a dozen joint events with the required
spectral data per year (Ghirlanda et al., 2009), and given that
one may expect that a sample of GRBs of about 150 events
may make an impact on Dark Energy studies comparable
to that of SN Ia (Ghirlanda et al., 2006), it is possible
that GRBs may be put to useful cosmological work sooner
rather than later.